Change layout of values in a table
Source: data import cheat sheet : tidyr
Two main functions to manipulate the layout of the table, pivot_longer transforms the table from wide to long format and pivot_wider, which does the opposite, converts the table from long to wide format.
pivot_longer : wide to long formatThis function allows collapsing ‘similar’ variables into one variable while guaranteeing the data set’s consistency. For example take the variables pulse1 and pulse2 in the following subset of the pulse data set:
pulses <- pulse %>%
select(name,pulse1,pulse2) %>%
head(3)
pulses
# A tibble: 3 × 3
name pulse1 pulse2
<chr> <dbl> <dbl>
1 Bonnie 86 88
2 Melanie 82 150
3 Consuelo 96 176
We can transform the table as such that all pulse values are under a single variable, say pulse:
dfLong <- pulses %>% pivot_longer(c(pulse1, pulse2), names_to = "pulse", values_to = "level")
dfLong
# A tibble: 6 × 3
name pulse level
<chr> <chr> <dbl>
1 Bonnie pulse1 86
2 Bonnie pulse2 88
3 Melanie pulse1 82
4 Melanie pulse2 150
5 Consuelo pulse1 96
6 Consuelo pulse2 176
This is called the long version of the original (wide) table and contains the same information.
Alternatively you achieve the same result using ! (negation operator):
dfLong <- pulses %>% pivot_longer(!name, names_to = "pulse", values_to = "level")
dfLong
# A tibble: 6 × 3
name pulse level
<chr> <chr> <dbl>
1 Bonnie pulse1 86
2 Bonnie pulse2 88
3 Melanie pulse1 82
4 Melanie pulse2 150
5 Consuelo pulse1 96
6 Consuelo pulse2 176
AnswerCan you find ohter variables in the pulse data set which can be transformed to long format?
Let’s reshape the pulse dataset with variables drug = {smokes,alcohol} and use = {yes,no}.
pulse %>% pivot_longer(c(smokes,alcohol), names_to = "drug", values_to = "use")
# A tibble: 220 × 13
id name height weight age gender exercise ran pulse1 pulse2 year drug use
<chr> <chr> <dbl> <dbl> <dbl> <chr> <chr> <chr> <dbl> <dbl> <dbl> <chr> <chr>
1 1993_A Bonnie 173 57 18 female moderate sat 86 88 1993 smokes no
2 1993_A Bonnie 173 57 18 female moderate sat 86 88 1993 alcohol yes
3 1993_B Melanie 179 58 19 female moderate ran 82 150 1993 smokes no
4 1993_B Melanie 179 58 19 female moderate ran 82 150 1993 alcohol yes
5 1993_C Consuelo 167 62 18 female high ran 96 176 1993 smokes no
6 1993_C Consuelo 167 62 18 female high ran 96 176 1993 alcohol yes
7 1993_D Travis 195 84 18 male high sat 71 73 1993 smokes no
8 1993_D Travis 195 84 18 male high sat 71 73 1993 alcohol yes
9 1993_E Lauri 173 64 18 female low sat 90 88 1993 smokes no
10 1993_E Lauri 173 64 18 female low sat 90 88 1993 alcohol yes
# … with 210 more rows
pivot_wider : long to wide formatdfWide <- dfLong %>% pivot_wider(names_from = "pulse", values_from = "level")
dfWide
# A tibble: 3 × 3
name pulse1 pulse2
<chr> <dbl> <dbl>
1 Bonnie 86 88
2 Melanie 82 150
3 Consuelo 96 176
Below, pulses tibble has been transformed into pulses2. What can you say about this transformation?
pulses2 <- pulses %>%
pivot_longer(!name, names_to = "pulse", values_to = "level") %>%
pivot_wider(names_from = "pulse", values_from = "level")
Answer
Copyright © 2021 Biomedical Data Sciences (BDS) | LUMC